Access Rights in Enterprise Full-text Search
Authors
Abstract
One of the toughest problems in deploying an enterprise-wide full-text search system is handling the access rights of documents and intranet web pages correctly and efficiently. Post-processing the results of a general-purpose full-text search engine (filtering out the documents inaccessible to the user who sent the query) can be an expensive operation, especially in large collections of documents. We discuss various approaches to this problem and propose a novel method which employs virtual tokens for encoding the access rights directly into the search index. We then evaluate this approach in an intranet system with several million documents and a complex set of access rights and access rules.
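The abstract does not spell out how the virtual tokens are encoded, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a toy in-memory inverted index and group-based access rights; the names `TinyIndex`, `acl_token`, and the `__acl__:` prefix are hypothetical. Each document is indexed under one extra token per group allowed to read it, so the access check becomes an ordinary posting-list intersection at query time instead of a post-processing pass over the results.

```python
from collections import defaultdict

# Hypothetical reserved prefix; user-supplied text must never produce it.
ACL_PREFIX = "__acl__:"


def acl_token(group):
    """Build the virtual token that marks a document as readable by `group`."""
    return ACL_PREFIX + group


def tokenize(text):
    """Lowercase whitespace tokenizer that drops anything resembling a virtual token."""
    return [w for w in text.lower().split() if not w.startswith(ACL_PREFIX)]


class TinyIndex:
    """Toy inverted index: token -> set of document ids."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text, allowed_groups):
        # Index the real content.
        for word in tokenize(text):
            self.postings[word].add(doc_id)
        # Index one virtual token per group that may read the document.
        for group in allowed_groups:
            self.postings[acl_token(group)].add(doc_id)

    def search(self, query, user_groups):
        words = tokenize(query)
        if not words:
            return set()
        # Documents containing every query word.
        matches = set.intersection(*(self.postings.get(w, set()) for w in words))
        # Documents visible to at least one of the user's groups.
        visible = set().union(
            *(self.postings.get(acl_token(g), set()) for g in user_groups)
        )
        return matches & visible
```

With three documents indexed as `add(1, "quarterly budget report", ["finance"])`, `add(2, "budget planning guide", ["finance", "staff"])`, and `add(3, "cafeteria menu", ["staff"])`, a user in the `staff` group who searches for `budget` gets only document 2. In a production engine the same effect would typically come from appending a filter clause over the virtual tokens to the user's query; the sketch leaves out the negative rules and rule inheritance that a complex set of access rules would require.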
Similar resources
A Framework for Visual Search in Broadcast archives
In today’s digital age, the ability to access, analyze and (re)use ever-growing amounts of data is a strategic asset for the broadcasting and media industry. Despite the growing interest around new technologies, archive’s search and retrieval operations are still usually done by means of text-based search over tags and metadata of manually pre-annotated material. This is particularly true becau...
Accessing Full Text of Articles: A Study on the Status of Medical Universities in Tehran
Introduction. Due to the rapid development of information technology and world wide web, there is easy and fast access to medical information and medical journals. Although there is free and easy access to articles' abstracts through Medline on the internet, accessing full text articles still remains a problem. This study was carried out to investigate the best way we could access full text of ...
An architecture to search private data using arbitrary assessment algorithms without disclosing their content
This paper describes an architecture which allows an un-trusted search algorithm supplied by a User to be granted access to private, non-disclosable data of a Data Provider. The result is that the User can receive the output of the algorithm, a relevance rating or other useful statistic, but cannot ever access the data itself. Some support is required by the Data Provider, who actively sets up ...
DEDUCE Clinical Text: An Ontology-based Module to Support Self-Service Clinical Notes Exploration and Cohort Development.
Large amounts of information, as well as opportunities for informing research, education, and operations, are contained within clinical text such as radiology reports and pathology reports. However, this content is less accessible and harder to leverage than structured, discrete data. We report on an extension to the Duke Enterprise Data Unified Content Explorer (DEDUCE), a self-service query t...
An Improved K-Nearest Neighbor with Crow Search Algorithm for Feature Selection in Text Documents Classification
The Internet provides easy access to a kind of library resources. However, classification of documents from a large amount of data is still an issue and demands time and energy to find certain documents. Classification of similar documents in specific classes of data can reduce the time for searching the required data, particularly text documents. This is further facilitated by using Artificial...